Impacts of Sample Design for Validation Data on the Accuracy of Feedforward Neural Network Classification
نویسنده
چکیده
Validation data are often used to evaluate the performance of a trained neural network and used in the selection of a network deemed optimal for the task at-hand. Optimality is commonly assessed with a measure, such as overall classification accuracy. The latter is often calculated directly from a confusion matrix showing the counts of cases in the validation set with particular labelling properties. The sample design used to form the validation set can, however, influence the estimated magnitude of the accuracy. Commonly, the validation set is formed with a stratified sample to give balanced classes, but also via random sampling, which reflects class abundance. It is suggested that if the ultimate aim is to accurately classify a dataset in which the classes do vary in abundance, a validation set formed via random, rather than stratified, sampling is preferred. This is illustrated with the classification of simulated and remotely-sensed datasets. With both datasets, statistically significant differences in the accuracy with which the data could be classified arose from the use of validation sets formed via random and stratified sampling (z = 2.7 and 1.9 for the simulated and real datasets respectively, for both p < 0.05%). The accuracy of the classifications that used a stratified sample in validation were smaller, a result of cases of an abundant class being commissioned into a rarer class. Simple means to address the issue are suggested.
منابع مشابه
Global Solar Radiation Prediction for Makurdi, Nigeria Using Feed Forward Backward Propagation Neural Network
The optimum design of solar energy systems strongly depends on the accuracy of solar radiation data. However, the availability of accurate solar radiation data is undermined by the high cost of measuring equipment or non-functional ones. This study developed a feed-forward backpropagation artificial neural network model for prediction of global solar radiation in Makurdi, Nigeria (7.7322 N lo...
متن کاملShape optimization of impingement and film cooling holes on a flat plate using a feedforward ANN and GA
Numerical simulations of a three-dimensional model of impingement and film cooling on a flat plate are presented and validated with the available experimental data. Four different turbulence models were utilized for simulation, in which SST had the highest precision, resulting in less than 4% maximum error in temperature estimation. A simplified geometry with periodic boundary conditions is de...
متن کاملA New Method for Intrusion Detection Using Genetic Algorithm and Neural Network
The article attempts to have neural network and genetic algorithm techniques present a model for classification on dataset. The goal is design model can the subject acted a firewall in network and this model with compound optimized algorithms create reliability and accuracy and reduce error rate couse of this is article use feedback neural network and compared to previous methods increase a...
متن کاملA New Method for Intrusion Detection Using Genetic Algorithm and Neural Network
The article attempts to have neural network and genetic algorithm techniques present a model for classification on dataset. The goal is design model can the subject acted a firewall in network and this model with compound optimized algorithms create reliability and accuracy and reduce error rate couse of this is article use feedback neural network and compared to previous methods increase a...
متن کاملDetermination of Best Supervised Classification Algorithm for Land Use Maps using Satellite Images (Case Study: Baft, Kerman Province, Iran)
According to the fundamental goal of remote sensing technology, the image classification of desired sensors can be introduced as the most important part of satellite image interpretation. There exist various algorithms in relation to the supervised land use classification that the most pertinent one should be determined. Therefore, this study has been conducted to determine the best and most su...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2017